Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up the support for split queues #78274

Merged
merged 7 commits into from
Oct 3, 2024
Merged

Conversation

fpacifici
Copy link
Contributor

@fpacifici fpacifici commented Sep 27, 2024

#76410 introduces a way to deliver messages to a set of split celery queues rather than relying on a single queue for
post process.

This was used to address an incident as we were saturating Rabbit resources
for a single queue (which is single threaded). Splitting messages across multiple
queues solves that problem.

This PR introduce a configurable support for split queues for scenarios where we pass
the queue name to apply_async.
There is another PR to deal with tasks that define the queue in the task definition:
#76494

This introduces a router class SplitQueueRouter that maps a queue to a list of split
queues picked in a round robin way.
This router is used by the client code that schedules a task with apply_async.
The configuration is held in CELERY_SPLIT_QUEUE_ROUTES and all the
split queues have to be declared as any other queue.

Rollout procedure to remove the hack from #76410

Copy link
Member

@markstory markstory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The rollout option default values will provide consistent behavior with the ops config patch that is in production currently.

@@ -821,6 +822,15 @@ def SOCIAL_AUTH_DEFAULT_USERNAME() -> str:
"sentry.integrations.tasks",
)

# tmp(michal): Default configuration for post_process* queueus split
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this still be considered temporary?

Suggested change
# tmp(michal): Default configuration for post_process* queueus split
# tmp(michal): Default configuration for post_process* queues split

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will be replaced by CELERY_SPLIT_QUEUE_ROUTES below. I'd remove the comment when I remove the setting.

# Use legacy route
# This router required to define the routing logic inside the
# settings file.
return settings.SENTRY_POST_PROCESS_QUEUE_SPLIT_ROUTER.get(queue, lambda: queue)()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks compatible with the patch that is in getsentry/ops config overrides 👍

return ret


def make_split_queues(config: Mapping[str, SplitQueueSize]) -> Sequence[Queue]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to call this outside of tests. Do we need to setup the post_process_transactions split queues, or are we going to rely on the patching being done in getsentry/ops for a while longer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a debate in the previous PR whether we should add the queues in sentry or getsentry.
#76494 (comment).
What do you think?

I am a bit biased towards adding them in the sentry configuration rather than getsentry to create fewer surprises.
Either way, in order to rollout safely, I need to deploy this first, then adapt the current patch to avoid queue duplication (if you declare the same queue twice two queues actually show up), then I can switch to using this method.

Comment on lines 34 to 35
legacy_mode = options.get("celery_split_queue_legacy_mode")
options.set("celery_split_queue_legacy_mode", [])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use override_options to modify options values via a context manager similar to override_settings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

damn, that's much better, thanks.

@fpacifici fpacifici enabled auto-merge (squash) October 3, 2024 19:57
@fpacifici fpacifici merged commit 52fbd94 into master Oct 3, 2024
50 checks passed
@fpacifici fpacifici deleted the fpacifici/cleanup_queue_split branch October 3, 2024 20:30
@github-actions github-actions bot locked and limited conversation to collaborators Oct 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants